Superresolution without Separation
This paper provides a theoretical analysis of diffraction-limited
superresolution, demonstrating that arbitrarily close point sources can be
resolved in ideal situations. Precisely, we assume that the incoming signal is
a linear combination of M shifted copies of a known waveform with unknown
shifts and amplitudes, and one only observes a finite collection of evaluations
of this signal. We characterize properties of the base waveform such that the
exact translations and amplitudes can be recovered from 2M + 1 observations.
This recovery is achieved by solving a weighted version of basis pursuit over
a continuous dictionary. Our methods combine classical polynomial interpolation
techniques with contemporary tools from compressed sensing.
Comment: 23 pages, 8 figures
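The recovery procedure can be illustrated by discretizing the continuous dictionary on a fine grid. The sketch below is a minimal illustration, not the paper's exact construction: the Gaussian waveform, the grid resolution, and the particular weighting are all assumptions made here for concreteness.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative known waveform: a Gaussian bump (an assumption; the paper
# treats a general class of base waveforms).
def psf(t):
    return np.exp(-t**2 / (2 * 0.2**2))

# Hypothetical ground truth: M = 2 sources, so 2M + 1 = 5 observations.
true_shifts = np.array([0.3, 0.7])
true_amps = np.array([1.0, 0.5])
samples = np.linspace(0, 1, 5)
y = sum(a * psf(samples - s) for a, s in zip(true_amps, true_shifts))

# Discretize the continuous dictionary of shifted waveforms on a fine grid.
grid = np.linspace(0, 1, 201)
A = psf(samples[:, None] - grid[None, :])      # 5 x 201 dictionary matrix

# Weighted basis pursuit with nonnegative amplitudes:
#     minimize  sum_j w_j x_j   subject to  A x = y,  x >= 0.
# The weights below are one simple choice; the paper derives the
# appropriate weighting from the waveform itself.
w = A.sum(axis=0)
res = linprog(c=w, A_eq=A, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x
support = grid[x_hat > 1e-6]                   # estimated source locations
```

Because the true shifts lie on the grid here, the linear program is exactly feasible; off-grid sources are what the paper's continuous-dictionary analysis handles.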
The geometry of kernelized spectral clustering
Clustering of data sets is a standard problem in many areas of science and
engineering. The method of spectral clustering is based on embedding the data
set using a kernel function, and using the top eigenvectors of the normalized
Laplacian to recover the connected components. We study the performance of
spectral clustering in recovering the latent labels of i.i.d. samples from a
finite mixture of nonparametric distributions. The difficulty of this label
recovery problem depends on the overlap between mixture components and how
easily a mixture component is divided into two nonoverlapping components. When
the overlap is small compared to the indivisibility of the mixture components,
the principal eigenspace of the population-level normalized Laplacian operator
is approximately spanned by the square-root kernelized component densities. In
the finite sample setting, and under the same assumption, embedded samples from
different components are approximately orthogonal with high probability when
the sample size is large. As a corollary we control the fraction of samples
mislabeled by spectral clustering under finite mixtures with nonparametric
components.
Comment: Published at http://dx.doi.org/10.1214/14-AOS1283 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
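A minimal sketch of the procedure the abstract describes: kernel embedding, top eigenvectors of the normalized Laplacian, then clustering of the embedded samples. The data, the kernel bandwidth, and the use of k-means on the embedding are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

rng = np.random.default_rng(0)
# Hypothetical data: i.i.d. samples from a two-component mixture whose
# overlap is small relative to the indivisibility of each component.
x = np.concatenate([rng.normal(-3, 0.5, 50), rng.normal(3, 0.5, 50)])

# Gaussian kernel matrix (bandwidth is an arbitrary illustrative choice).
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 1.0 ** 2))

# Normalized affinity D^{-1/2} K D^{-1/2}; its top eigenvectors span the
# principal eigenspace of the normalized Laplacian referred to above.
d = K.sum(axis=1)
M = K / np.sqrt(d[:, None] * d[None, :])
_, vecs = np.linalg.eigh(M)          # eigenvalues returned in ascending order
embed = vecs[:, -2:]                 # embed samples via the top 2 eigenvectors

# Embedded samples from different components are nearly orthogonal, so
# k-means on the row-normalized embedding recovers the latent labels.
embed /= np.linalg.norm(embed, axis=1, keepdims=True)
_, labels = kmeans2(embed, 2, minit="++", seed=0)
```

With well-separated components the kernel matrix is nearly block-diagonal, which is why the top eigenspace is approximately spanned by per-component vectors.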
Learning Developmental Landscapes from Single Cell Gene Expression with Optimal Transport
Non UBC; Unreviewed; Author affiliation: MIT; Postdoctoral
Sparse Inverse Problems: The Mathematics of Precision Measurement
The interplay between theory and experiment is the key to progress in the natural sciences. This thesis develops the mathematics of distilling knowledge from measurement. Specifically, we consider the inverse problem of recovering the input to a measurement apparatus from the observed output. We present separate analyses for two different models of input signals. The first setup is superresolution. Here, the input is a collection of continuously parameterized sources, and we observe a weighted superposition of signals from all of the sources. The second setup is unsupervised classification. The input is a collection of categories, and the output is an unlabeled set of objects from the different categories. In Chapter 1 we introduce these measurement modalities in greater detail and place them in a common framework.

Chapter 2 provides a theoretical analysis of diffraction-limited superresolution, demonstrating that arbitrarily close point sources can be resolved in ideal situations. Precisely, we assume that the incoming signal is a linear combination of M shifted copies of a known waveform with unknown shifts and amplitudes, and one only observes a finite collection of evaluations of this signal. We characterize properties of the base waveform such that the exact translations and amplitudes can be recovered from 2M + 1 observations. This recovery can be achieved by solving a weighted version of basis pursuit over a continuous dictionary. Our analysis shows that l1-based methods enjoy the same separation-free recovery guarantees as polynomial root-finding techniques such as Prony’s method or Vetterli’s method for signals of finite rate of innovation. Our proof techniques combine classical polynomial interpolation techniques with contemporary tools from compressed sensing.

In Chapter 3 we propose a variant of the classical conditional gradient method (CGM) for superresolution problems with differentiable measurement models. Our algorithm combines nonconvex and convex optimization techniques: we propose global conditional gradient steps alternating with nonconvex local search exploiting the differentiable observation model. This hybridization gives the theoretical global optimality guarantees and stopping conditions of convex optimization along with the performance and modeling flexibility associated with nonconvex optimization. Our experiments demonstrate that our technique achieves state-of-the-art results in several applications.

Chapter 4 focuses on unsupervised classification. Clustering of data sets is a standard problem in many areas of science and engineering. The method of spectral clustering is based on embedding the data set using a kernel function, and using the top eigenvectors of the normalized Laplacian to recover the connected components. We study the performance of spectral clustering in recovering the latent labels of i.i.d. samples from a finite mixture of nonparametric distributions. The difficulty of this label recovery problem depends on the overlap between mixture components and how easily a mixture component is divided into two nonoverlapping components. When the overlap is small compared to the indivisibility of the mixture components, the principal eigenspace of the population-level normalized Laplacian operator is approximately spanned by the square-root kernelized component densities. In the finite sample setting, and under the same assumption, embedded samples from different components are approximately orthogonal with high probability when the sample size is large. As a corollary we control the fraction of samples mislabeled by spectral clustering under finite mixtures with nonparametric components.
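The hybrid scheme described for Chapter 3 can be sketched as alternating a global conditional-gradient step (adding the candidate source most correlated with the current residual) with nonconvex local search through the differentiable measurement model. Everything concrete below is an illustrative assumption: the Gaussian model, the step sizes, and the iteration counts.

```python
import numpy as np

# Differentiable measurement model: sum of Gaussian bumps (an assumption;
# the thesis allows general differentiable observation models).
SIGMA = 0.2
samples = np.linspace(0, 1, 30)

def forward(shifts, amps):
    return (amps[None, :] * np.exp(-(samples[:, None] - shifts[None, :]) ** 2
                                   / (2 * SIGMA**2))).sum(axis=1)

# Hypothetical ground truth to recover.
y = forward(np.array([0.35, 0.65]), np.array([1.0, 0.8]))

shifts, amps = np.empty(0), np.empty(0)
grid = np.linspace(0, 1, 200)
for _ in range(2):  # one new source per outer iteration
    # Global conditional-gradient step: add the grid atom most correlated
    # with the current residual.
    residual = y - forward(shifts, amps)
    atoms = np.exp(-(samples[:, None] - grid[None, :]) ** 2 / (2 * SIGMA**2))
    shifts = np.append(shifts, grid[np.argmax(atoms.T @ residual)])
    amps = np.append(amps, 1.0)
    # Nonconvex local search: joint gradient descent on all shifts and
    # amplitudes through the differentiable model.
    for _ in range(2000):
        r = forward(shifts, amps) - y
        G = np.exp(-(samples[:, None] - shifts[None, :]) ** 2 / (2 * SIGMA**2))
        grad_amps = G.T @ r
        grad_shifts = amps * ((G * (samples[:, None] - shifts[None, :])
                               / SIGMA**2).T @ r)
        amps -= 0.05 * grad_amps
        shifts -= 5e-4 * grad_shifts
```

The conditional-gradient step supplies a good initialization and a convex-optimization stopping criterion, while the local search moves sources off the grid, which is the flexibility the discretized convex approach alone lacks.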
Trajectory Inference via Mean-field Langevin in Path Space
Trajectory inference aims at recovering the dynamics of a population from
snapshots of its temporal marginals. To solve this task, a min-entropy
estimator relative to the Wiener measure in path space was introduced by
Lavenant et al. arXiv:2102.09204, and shown to consistently recover the
dynamics of a large class of drift-diffusion processes from the solution of an
infinite dimensional convex optimization problem. In this paper, we introduce a
grid-free algorithm to compute this estimator. Our method consists of a family
of point clouds (one per snapshot) coupled via Schrödinger bridges, which
evolve with noisy gradient descent. We study the mean-field limit of the
dynamics and prove its global convergence to the desired estimator. Overall,
this leads to an inference method with end-to-end theoretical guarantees that
solves an interpretable model for trajectory inference. We also show how to
adapt the method to deal with mass variations, a useful extension when dealing
with single cell RNA-sequencing data, where cells can branch and die.
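At the discrete level, the coupling between two adjacent point clouds can be approximated by entropic optimal transport computed with Sinkhorn iterations. The sketch below shows only this building block, not the full noisy-gradient-descent dynamics of the paper; the one-dimensional snapshots and the regularization level eps are hypothetical choices.

```python
import numpy as np

def sinkhorn_coupling(x, y, eps=0.1, n_iter=500):
    """Entropic-OT coupling between two point clouds with uniform weights,
    a discrete stand-in for the Schrodinger bridge between snapshots."""
    a = np.full(len(x), 1.0 / len(x))      # source marginal
    b = np.full(len(y), 1.0 / len(y))      # target marginal
    C = (x[:, None] - y[None, :]) ** 2     # squared-distance cost
    K = np.exp(-C / eps)                   # Gibbs kernel
    v = np.ones(len(y))
    for _ in range(n_iter):
        u = a / (K @ v)                    # rescale to match row marginals
        v = b / (K.T @ u)                  # rescale to match column marginals
    return u[:, None] * K * v[None, :]     # transport plan

rng = np.random.default_rng(1)
x0 = rng.normal(0.0, 0.1, 40)   # snapshot at time t0 (hypothetical data)
x1 = rng.normal(1.0, 0.1, 40)   # snapshot at time t1
P = sinkhorn_coupling(x0, x1)
```

In the full method one such bridge couples each consecutive pair of snapshots, and the point clouds themselves are then evolved by noisy gradient descent whose mean-field limit converges to the min-entropy estimator.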